import { Callout } from "nextra/components";

# GPU Instance Configuration

<Callout type="warning">
  This feature is currently in beta and may be subject to change.
</Callout>

## Overview

Hatchet supports GPU-accelerated workloads using NVIDIA GPUs. This guide covers GPU configuration options, Docker setup, and available regions. For basic compute configuration concepts, please refer to the [CPU Instance Configuration](cpu-configuration.md) documentation.

## GPU Types and Availability

Hatchet currently supports the following NVIDIA GPU types:

| GPU Model        | Memory | Available Regions                                                         |
| ---------------- | ------ | ------------------------------------------------------------------------- |
| NVIDIA A10       | 24GB   | ord (Chicago)                                                             |
| NVIDIA L40S      | 48GB   | ord (Chicago)                                                             |
| NVIDIA A100-PCIe | 40GB   | ord (Chicago)                                                             |
| NVIDIA A100-SXM4 | 80GB   | ams (Amsterdam), iad (Ashburn), mia (Miami), sjc (San Jose), syd (Sydney) |

## Basic Configuration

```python
from hatchet_sdk.compute.configs import Compute

compute = Compute(
    gpu_kind="a100-80gb",    # GPU type
    gpus=1,                  # Number of GPUs
    memory_mb=163840,        # Memory in MB
    num_replicas=1,          # Number of instances
    regions=["ams"]          # Must be a region that supports your chosen GPU
)
```

## Docker Configuration

### Example Dockerfile

```dockerfile
# Base image
FROM ubuntu:22.04

# Install CUDA and required libraries
RUN apt-get update && apt-get install -y --no-install-recommends \
    cuda-nvcc-12-2 \
    libcublas-12-2 \
    libcudnn8 \
    && rm -rf /var/lib/apt/lists/*

# Your application setup
WORKDIR /app
COPY . .

# Application entrypoint
CMD ["python", "worker.py"]
```

### Best Practices for Docker for GPU Workloads

1. **Selective Library Installation**

```dockerfile
# DO NOT use meta-packages
# ❌ RUN apt-get install cuda-runtime-*

# ✅ Install only required libraries
RUN apt-get install -y \
    cuda-nvcc-12-2 \
    libcublas-12-2 \
    libcudnn8
```

2. **Multi-stage Builds**

```dockerfile
# Build stage
FROM ubuntu:22.04 AS builder
RUN apt-get update && apt-get install -y cuda-nvcc-12-2

# Runtime stage
FROM ubuntu:22.04
COPY --from=builder /usr/local/cuda-12.2 /usr/local/cuda-12.2
```

## Usage in Workflows

```python
from hatchet_sdk import Hatchet, Context

hatchet = Hatchet()

@hatchet.workflow()
class GPUWorkflow:
    @hatchet.step(
        compute=Compute(
            gpu_kind="a100-80gb",
            gpus=1,
            memory_mb=163840,
            num_replicas=1,
            regions=["ams"]
        )
    )
    def train_model(self, context: Context):
        # GPU-accelerated code here
        pass
```

## Memory and Resource Allocation

### Available Memory per GPU Type

- **A10**: 24GB GPU Memory
- **L40S**: 48GB GPU Memory
- **A100-PCIe**: 40GB GPU Memory
- **A100-SXM4**: 80GB GPU Memory

When configuring memory_mb, ensure it's sufficient for both system memory and GPU operations.

## Region-Specific Configurations

### A100-80GB Example

```python
# Multi-region A100-80GB configuration
compute = Compute(
    gpu_kind="a100-80gb",
    gpus=1,
    memory_mb=163840,
    num_replicas=3,
    regions=["ams", "sjc", "syd"]  # Replicas will be randomly distributed
)
```

### A10 Example

```python
# Chicago-based A10 configuration
compute = Compute(
    gpu_kind="a10",
    gpus=1,
    memory_mb=49152,
    num_replicas=2,
    regions=["ord"]
)
```

## Best Practices

1. **GPU Selection**

   - Choose GPU type based on workload requirements
   - Consider memory requirements for your models
   - Factor in regional availability

2. **Docker Optimization**

   - Use specific library versions instead of meta-packages
   - Implement multi-stage builds to reduce image size
   - Only install required CUDA libraries

3. **Region Strategy**

   - Select regions based on data locality
   - Consider backup regions for redundancy
   - Remember that replicas are randomly distributed across specified regions

4. **Resource Management**
   - Monitor GPU utilization
   - Optimize batch sizes for GPU memory
   - Consider CPU and system memory requirements alongside GPU resources

## Common Issues and Solutions

1. **CUDA Version Compatibility**

   ```dockerfile
   # Ensure CUDA toolkit version matches driver
   RUN apt-get install -y cuda-nvcc-12-2
   ```

2. **Library Dependencies**

   ```dockerfile
   # Install commonly needed libraries
   RUN apt-get install -y \
       libcublas-12-2 \
       libcudnn8 \
       nvidia-driver-525
   ```

3. **Region Availability**

Region support is limited while GPUs are in beta. Always verify GPU availability in chosen regions.

Remember to monitor your GPU workload performance and adjust configurations as needed. Consider implementing proper error handling for GPU-related operations and ensure your code gracefully handles scenarios where GPUs might be temporarily unavailable.
